Reinforcement learning of a simple control task using the spike response model
Abstract
In this work, we propose a variation of a direct reinforcement learning algorithm suitable for use with spiking neurons based on the spike response model (SRM). The SRM is a biologically inspired, flexible model of a spiking neuron, based on kernel functions that describe the effect of spike reception and emission on the neuron's membrane potential. In our experiments, the spikes emitted by an SRM neuron are used as input signals in a simple control task. The reinforcement signal obtained from the environment is used by the direct reinforcement learning algorithm, which modifies the synaptic weights of the neuron and thereby adjusts its spike firing times to improve performance on the given problem. For simple control tasks, the results are comparable to those of classic methods based on value function approximation and temporal difference learning. © 2006 Elsevier B.V. All rights reserved.
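As an illustration of the kernel-based description of the SRM mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a double-exponential kernel for received spikes (`epsilon`) and an exponential reset kernel for emitted spikes (`eta`); the kernel shapes, the time constants, the threshold, and all function names are hypothetical choices made only for this example. The reinforcement learning weight update itself is not shown, because the abstract does not specify it.

```python
import math

def epsilon(s, tau_m=10.0, tau_s=2.5):
    """Kernel for a received spike (assumed double-exponential shape); zero for s <= 0."""
    if s <= 0:
        return 0.0
    return math.exp(-s / tau_m) - math.exp(-s / tau_s)

def eta(s, theta=1.0, tau_r=5.0):
    """Kernel for an emitted spike (assumed exponential reset); zero for s <= 0."""
    if s <= 0:
        return 0.0
    return -theta * math.exp(-s / tau_r)

def membrane_potential(t, weights, input_spikes, output_spikes, theta=1.0):
    """Sum the weighted input kernels and the refractory kernels at time t."""
    u = sum(eta(t - tf, theta) for tf in output_spikes)
    for w, spikes in zip(weights, input_spikes):
        u += w * sum(epsilon(t - tf) for tf in spikes)
    return u

def simulate(weights, input_spikes, theta=1.0, t_end=50.0, dt=0.1):
    """Step through time and record a spike whenever the potential reaches theta."""
    output_spikes = []
    steps = int(round(t_end / dt))
    for k in range(steps + 1):
        t = k * dt
        u = membrane_potential(t, weights, input_spikes, output_spikes, theta)
        if u >= theta:
            output_spikes.append(t)
    return output_spikes

if __name__ == "__main__":
    # Two synapses, each receiving one spike at t = 1.0 (arbitrary time units).
    inputs = [[1.0], [1.0]]
    print("weak weights:  ", simulate([3.0, 3.0], inputs)[:3])
    print("strong weights:", simulate([6.0, 6.0], inputs)[:3])
```

In this sketch, increasing the synaptic weights makes the potential reach the threshold sooner, so the first output spike moves earlier. This is the sense in which a learning rule that changes the weights also changes the firing times.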
Similar resources
Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism
This paper develops an adaptive control method for controlling the frequency and voltage of an islanded mini/micro grid (M/µG) using a reinforcement learning method. Reinforcement learning (RL) is one of the branches of machine learning, which is the main solution method of Markov decision processes (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...
Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic
In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol because the design of a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data...
Pair-Associate Learning with Modulated Spike-Time Dependent Plasticity
We propose an associative learning model using reward-modulated spike-time-dependent plasticity in a reinforcement learning paradigm. The task of learning is to associate a stimulus pair, known as the predictor-choice pair, to a target response. In our model, a generic architecture of neural network has been used, with minimal assumptions about the network dynamics. We demonstrate that stimulus-s...
Cycle Time Optimization of Processes Using an Entropy-Based Learning for Task Allocation
Cycle time optimization could be one of the great challenges in business process management. Although there is much research on this subject, task similarities have been paid little attention. In this paper, a new approach is proposed to optimize cycle time by minimizing entropy of work lists in resource allocation while keeping workloads balanced. The idea of the entropy of work lists comes fr...
Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity
The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intri...
Journal: Neurocomputing
Volume: 70
Publication date: 2006